In this study, we investigated the genome-wide occurrence of histone acetyltransferases (HATs) in the
genomes of Mus musculus and Danio rerio on the basis of the presence of the HAT domain. Our study identified a
group of proteins that lack the characteristic features of known HAT families, are relatively small and have
no other associated domains. Most of the proteins in this unclassified group are Camello proteins, which have
not previously been characterized or classified as functional HATs. Our in vitro and in vivo analyses revealed
that Camello family proteins are active HATs and exhibit specificity towards histone H4. Interestingly, Camello
proteins are among the first identified HATs showing perinuclear localization. Moreover, Camello proteins are
evolutionarily conserved in all chordates and, in phylogeny, are first observed in cnidarians.
Furthermore, knockdown of a Camello protein (CML03) in zebrafish embryos resulted in defects in axis
elongation and head formation. Thus, our study identified a novel family of active HATs that is specific for
histone H4 acetylation, exhibits perinuclear localization and is essential for zebrafish development.

Gene expression in eukaryotes is a tightly controlled process involving a complex interplay between
chromatin proteins and transcription factors. The functional availability of these factors and the accessibility
of the DNA sequence define the state of gene activation or repression. DNA in chromatin is wrapped around
histone octamers comprising two copies each of the four core histone proteins (H2A, H2B, H3 and H4) to form
discrete nucleosome units. The N-terminal tails of these core histones protrude from the nucleosome particles
and are subject to various post-translational modifications such as acetylation, methylation, phosphorylation
and ubiquitination.

Histone acetylation by histone acetyltransferases (HATs) is one of the most extensively studied covalent
histone modifications. HATs modify the physico-chemical properties of core histones through acetylation, influence
nucleosome structure and participate in transcription regulation. However, many HATs can also act on
non-histone proteins (cytoplasmic as well as nuclear) and have therefore been renamed lysine acetyltransferases (KATs).
Acetylation of core histones and non-histone proteins is correlated with various cellular processes such as
transcription regulation, chromatin assembly, DNA repair and cell cycle progression.

Characterization of HATs on the basis of protein sequence and domain organization reveals five distinct
families of HATs. (i) The largest of these families is the GNAT (GCN5-related N-acetyltransferase) family, whose
members share a highly conserved acetylation-related structural motif. GCN5, a member of the GNAT
family, is the best-characterized HAT protein and serves as a prototype for histone acetyltransferase studies. One
of the characteristic features of the GNAT family is a carboxy-terminal bromo-domain, which helps in targeting
proteins to the substrate. GNAT family proteins are also known to acetylate non-histone proteins as well as small
molecules. (ii) Another family is the MYST (MOZ, Ybf2/Sas3, Sas2 and Tip60) family, which also has an
acetylation-related structural motif. Many of the MYST family proteins contain zinc fingers as well as a
chromo-domain. The presence of a chromo-domain in the MYST family suggests that its members might interact with
heterochromatin-associated proteins. The GNAT and MYST families contain dozens of lysine acetyltransferase
enzymes, most of which are part of multi-subunit transcriptional co-activator complexes. (iii) The P300/CBP
(CREB-binding protein) family consists of two paralogous proteins, P300 and CBP. These two proteins have
interchangeable functions. Members of the P300/CBP family contain many functional domains, including
an acetylation-related structural motif that is involved in acetyl-CoA binding, three zinc finger regions and
a bromo-domain. P300/CBP act as co-activators and harbor domains for interaction with many
transcription factors. (iv) The fourth group of HATs is the basal transcription factor family, which is
related to mammalian TAFII250, the largest subunit of the transcription factor complex TFIID. Basal
transcription factor family proteins also act as HATs but do not harbor an acetylation-related structural
motif. (v) The last of the HAT families is the nuclear receptor cofactor family, which is largely specific
to mammals. Members of this family include nuclear receptor co-activators such as steroid receptor
co-activator 1 (SRC1) and clock circadian regulator (CLOCK). Members of this family are functionally known
to act as HATs, but they do not have any acetylation-related structural motif.

Here, we performed a genome-wide survey of lysine acetyltransferase proteins in the mouse and zebrafish
genomes. Our genome-wide bioinformatics analysis identified a novel family of HATs, namely
Camello proteins, which harbor the HAT domain. We demonstrated that Camello family proteins are active
HATs and have specificity towards histone H4 acetylation. We also show that Camello proteins have
perinuclear localization and that their overexpression leads to increased acetylation of histone H4.
Finally, we demonstrated the in vivo role of Camello histone acetyltransferases by knockdown of CML03 in
zebrafish embryos. Morpholino-mediated knockdown of CML03 resulted in defects in axis elongation and head
formation, suggesting a critical role in zebrafish development.

Results 

Genome-wide identification of HATs in mouse and zebrafish genomes. The mouse genome sequence was searched
for homologs of known histone acetyltransferases. Briefly, we employed a query set of HATs from all kingdoms
of life, consisting of proteins that harbor known HAT domains and have previously been classified as HATs,
e.g. GCN5. A total of 293 HAT domain-containing proteins were identified from all kingdoms of life
and their homologs were surveyed in the mouse proteome database.
After removing redundant sequences and false positives, we obtained
33 putative HAT-like proteins in the mouse proteome. These 33
putative HAT-like proteins are encoded by 21 mouse genes, indicating the presence of isoforms for a few of
these proteins. Phylogenetic analysis of these 33 HATs revealed groups of proteins
from the MYST family, the GNAT family (GCN5/PCAF, ATAC2, ARD) and the
P300/CBP family, as well as a group of proteins that are not yet
classified as HATs (Figure 1A). Most of the proteins in the unclassified
group are Camello proteins, and sequence alignment of their acetyl-CoA-binding domain with that of GCN5
suggests that the acetyl-CoA-binding motif is conserved between them (Figure 1B). Further, to
confirm that the Camello group of proteins also occurs in other chordates, we performed a similar analysis
in zebrafish and identified 19 putative HAT proteins encoded by 18 genes, suggesting their
conservation in the zebrafish genome (Additional file 1: Figure S1).

Next, we analyzed the sequence features of the Camello proteins,
which are smaller than members of other HAT families.
CML03, a representative Camello protein,
is 226 amino acids long, and its HAT domain extends from amino acid 78 to 204
(Figure 1C). The subcellular localization of CML03 is
predicted to be nuclear (with low confidence), and the protein is likely to have two
transmembrane helices, at positions 37-56 and 60-77 of the
sequence, according to the PSORT and Phobius web servers, respectively. The presence
of two transmembrane helices suggests that CML03 is likely
to be a nuclear membrane protein. This in turn suggests that the C-terminal
region of CML03, which contains the HAT domain, is nuclear
and can still access nuclear histones for acetylation. Interestingly, the
amino acid sequence between positions 61 and 84 (which partially overlaps with the
transmembrane helix) also shows similarity to the homeobox
signature motif, suggesting putative DNA-binding activity of the
protein. Thus, our genome-wide bioinformatics analysis identified a
novel family of HATs, which harbors the HAT domain but has no
other associated domains.
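The positional relationships described above can be checked with a short sketch. The coordinates (protein length 226, HAT domain 78-204, predicted helices 37-56 and 60-77, homeobox-like stretch 61-84) are taken from the text; the `Region` type and helper names are illustrative assumptions, not part of the original analysis.

```python
from typing import NamedTuple, Optional, Tuple


class Region(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def overlap(a: Region, b: Region) -> Optional[Tuple[int, int]]:
    """Return the shared (start, end) of two regions, or None if they are disjoint."""
    start, end = max(a.start, b.start), min(a.end, b.end)
    return (start, end) if start <= end else None


def overlap_length(a: Region, b: Region) -> int:
    """Number of residues shared by two regions."""
    hit = overlap(a, b)
    return 0 if hit is None else hit[1] - hit[0] + 1


# Coordinates as reported in the text for CML03.
PROTEIN_LENGTH = 226
TM1 = Region("transmembrane helix 1", 37, 56)
TM2 = Region("transmembrane helix 2", 60, 77)
HAT = Region("HAT domain", 78, 204)
HOMEOBOX_LIKE = Region("homeobox-like motif", 61, 84)

if __name__ == "__main__":
    for other in (TM1, TM2, HAT):
        print(other.name, overlap(HOMEOBOX_LIKE, other))
    hat_length = HAT.end - HAT.start + 1
    print(f"HAT domain spans {hat_length} of {PROTEIN_LENGTH} residues")
```

With these coordinates, the homeobox-like stretch shares residues 61-77 with the second predicted helix and residues 78-84 with the HAT domain, and does not touch the first helix.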

Recombinant Camello proteins exhibit histone acetyltransferase
activity specific for histone H4. Next, to test the enzymatic activity
of the Camello proteins identified in our bioinformatics analysis, we
expressed recombinant full-length mouse Camello proteins
(CML03, CML02 and E0CYR6) in bacterial cells and purified
them to homogeneity (Additional file 2: Figure S2). HAT assays
were performed using the purified recombinant Camello proteins with
histones H3 and H4 as substrates to study the functional status of the
proteins. We found that two Camello proteins (E0CYR6 and CML03)
are active histone acetyltransferases with substrate preference for
histone H4 in the HAT assay (Figure 2A, lanes 2 and 3). However,
CML02 did not exhibit any HAT activity under the conditions
employed (Figure 2A, lane 4). There are two possible explanations for the
lack of acetylation of histone H4 by CML02: (i) it might act on
other histones or non-histone proteins as substrates, or (ii) it
might require other partners or modifications of the substrate for
its activity. Additionally, specific antibodies (H3ac, H4ac,
H4K5ac, H4K8ac and H4K12ac) were used to cross-examine the
acetylation levels on histones H3 and H4 produced by CML03 and E0CYR6
(Figure 2B). Further, to identify the residues modified by CML03
and E0CYR6, mass spectrometric analysis was performed on
histones H3 and H4 that had been modified by CML03 and E0CYR6
in vitro. Both CML03 and E0CYR6 catalyzed acetylation of multiple
lysines (H4K5, H4K8, H4K12 and H4K16) on histone H4 (Figure 2C
and 2D). No acetylated residues on histone H3 could be identified
using a similar strategy. Thus, these in vitro assays demonstrated that
both CML03 and E0CYR6 are indeed active histone acetyltransferases
with substrate preference for histone H4.
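As a worked check of the peptide masses reported with the mass spectrometry data (Figure 2), the sketch below computes the average mass of each acetylated H4 peptide. It assumes the expected Mw values are average (not monoisotopic) masses of the neutral peptide; the residue, water and acetyl masses are standard average values supplied here, not values from the paper. Lowercase `k` marks an acetylated lysine, following the notation used in the figure.

```python
# Standard average residue masses (Da) for the amino acids in these peptides.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519,
    "A": 71.0788,
    "L": 113.1594,
    "K": 128.1741,
    "R": 156.1875,
}
WATER = 18.0153   # one water per peptide (free N- and C-termini)
ACETYL = 42.0367  # net addition of C2H2O per acetylated lysine


def acetylated_peptide_mass(peptide: str) -> float:
    """Average mass of a peptide in which lowercase 'k' denotes acetyl-lysine."""
    mass = WATER
    for residue in peptide:
        mass += AVERAGE_RESIDUE_MASS[residue.upper()]
        if residue == "k":
            mass += ACETYL
    return mass


if __name__ == "__main__":
    # Expected Mw values as reported in Figure 2.
    reported = {
        "GLGkGGAkR": 927.06,
        "GkGGKGLGkGGAkR": 1396.59,
        "GkGGkGLGkGGAkR": 1438.62,
    }
    for peptide, expected in reported.items():
        print(f"{peptide}: computed {acetylated_peptide_mass(peptide):.2f}, reported {expected:.2f}")
```

Under these assumptions the computed masses agree with the reported expected values to within a few hundredths of a dalton; the small residual differences depend on which table of average atomic masses is used.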

Camello proteins CML03 and E0CYR6 exhibit perinuclear localization.
The Camello group of proteins is described in the Pfam database as
probable N-acetyltransferases (NAT) based on
sequence similarity to human Nat8 and Nat8l, which are reported
to be aspartate acetyltransferases (AAT) specific to kidney and liver
in human. Furthermore, Nat8 and Nat8l have been shown to localize
to the secretory pathway, primarily in the endoplasmic reticulum,
and to display lysine acetyltransferase activity. To investigate the
subcellular localization of Camello proteins, we generated
CML03-GFP and E0CYR6-GFP fusion constructs and expressed
them in HeLa cells. Our results suggest that, unlike human NAT8,
CML03 and E0CYR6 are not localized in the endoplasmic
reticulum (data shown for CML03, Figure 3A and Additional file
3: Figure S3). Interestingly, we found that CML03 and E0CYR6
primarily exhibit perinuclear localization (data shown for CML03,
Figure 3B and 3C and Additional file 4: Figure S4). The CML03-GFP
fusion protein was also found in the cytoplasm of cells strongly
overexpressing it. Moreover, localization
of CML03 to the nuclear compartment was also shown by Western
blots after cytoplasmic and nuclear fractionation (Figure 3D).
Furthermore, overexpression of CML03 and E0CYR6 in HeLa cells
resulted in higher acetylation of endogenous histone H4 (Figure 3E,
right panel, lanes 2 and 3), suggesting the involvement of Camello
proteins in acetylation of histone H4. To test whether the observed
subcellular localization of the CML03-GFP fusion protein is driven by
the bulky C-terminal GFP tag, we introduced a his-myc tag separately at
the C-terminus and the N-terminus of CML03. We found that his-myc
tagging of CML03 at the C-terminus recapitulated the localization
profile of the CML03-GFP fusion protein and its overlap with
lamin B1, a marker of the inner nuclear membrane, which further
strengthened our observation that CML03 is indeed localized in
the nucleus (Additional file 5: Figure S5). However, N-terminal
tagging of CML03 with the his-myc tag affected the localization of
CML03, which was found mostly as punctate structures (Additional
file 5: Figure S5). Such a localization pattern of CML03 could
Figure 1 | Complement of HAT homologs in the mouse genome. (A) The dendrogram was constructed using the UPGMA method in MEGA version
5.1 with 100 bootstrap resamplings. The dendrogram of mouse HATs is classified into known families and a new family containing Camello proteins. Homologs of
different HAT families are represented in different colors. The new HAT family identified in this study, the "Camello family", is highlighted in grey.
(B) Multiple sequence alignment of seven protein sequences belonging to the Camello family with GCN5 shows the presence of the acetyl-CoA-binding motif in all
members of the Camello family. (C) General domain organization of GCN5 and the Camello protein CML03. The HAT domain of GCN5 is tethered to
additional domains, whereas CML03 has just the HAT domain.



SCIENTIFIC REPORTS | 4 : 6076 | DOI: 10.1038/srep06076






[Figure 2A, B: blot images not reproduced. Recoverable labels: Western blot; Ponceau stain; H4; size markers at 15 and 10 kDa; WB: H3ac, H3, H4K5,K8,K12ac, H4ac, H4.]
Sequence          Modifications (acetylation)   XCorr   Charge   CML03   E0CYR6
GLGkGGAkR         K12 and K16                   2.88    2        X       X
GkGGKGLGkGGAkR    K5 and K12 and K16            3.25    3        X       X
GkGGkGLGkGGAkR    K5 and K8 and K12 and K16     4.20    2        X       X


GLGkGGAkR (K12 and K16): expected Mw 927.06; observed Mw 927.53
GkGGKGLGkGGAkR (K5 and K12 and K16): expected Mw 1396.59; observed Mw 1396.79
GkGGkGLGkGGAkR (K5 and K8 and K12 and K16): expected Mw 1438.62; observed Mw 1438.81



Figure 2 | Camello family proteins CML03 and E0CYR6 are active histone acetyltransferases. (A) HAT assay using baculovirus-produced recombinant
histones H3 and H4. E0CYR6 and CML03 show histone H4 acetylation (lanes 2 and 3, respectively). (The full-length blot is provided as Additional
file 8: Figure S7.) (B) Western blots using site-specific antibodies (H3ac, H4ac and H4K5, 8, 12ac). E0CYR6 and CML03 are specific towards histone H4
acetylation and have no activity on histone H3. All samples were run on 15% SDS-polyacrylamide gels under similar conditions. (C) Mass
spectrometric analysis of gel-cut histones H3 and H4. Both CML03 and E0CYR6 were found to acetylate multiple lysine residues (H4K5, H4K8, H4K12
and H4K16) on histone H4. The first column indicates the mass spectral peptide. The second column indicates the positions of the acetylated lysine residues on histone H4.
XCorr is the cross-correlation value; XCorr values above 2.0 are considered a good correlation. Charge is the charge state of the peptide. An X in the CML03 and E0CYR6 columns
indicates acetylation by that protein. (D) Complete mass spectra (intensity vs. mass-to-charge ratio) of the modified peptides.


presumably be due to interference with, or hindrance of, its N-terminal
transmembrane helix by the his-myc tag. This in turn indicates that
Camello proteins are indeed different from the NAT8 protein, as they have
different functions and subcellular localization. Collectively, our analysis
suggests that CML03 and E0CYR6 have perinuclear localization and that
their overexpression leads to higher acetylation of histone H4 in vivo.
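The fold enrichment quoted for Figure 3E (band intensity normalized to the tubulin loading control, then expressed relative to the control lane) can be written out as a small calculation. The sketch below is illustrative only: the intensity values are hypothetical numbers chosen so that the result matches the reported 1.5- and 2.5-fold enrichment, and the function name is an assumption, not the authors' quantification code.

```python
def fold_enrichment(signal: float, loading: float,
                    control_signal: float, control_loading: float) -> float:
    """Loading-normalized signal of a sample relative to the control lane."""
    if loading <= 0 or control_loading <= 0 or control_signal <= 0:
        raise ValueError("intensities must be positive")
    return (signal / loading) / (control_signal / control_loading)


# Hypothetical densitometry values (arbitrary units): (H4ac band, tubulin band).
LANES = {
    "control": (1000.0, 2000.0),
    "CML03": (1650.0, 2200.0),
    "E0CYR6": (2250.0, 1800.0),
}

if __name__ == "__main__":
    ctrl_signal, ctrl_loading = LANES["control"]
    for name, (signal, loading) in LANES.items():
        value = fold_enrichment(signal, loading, ctrl_signal, ctrl_loading)
        print(f"{name}: {value:.2f}-fold")
```

Dividing by the loading control first means that unequal loading between lanes does not inflate or mask the change in the acetylation signal.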

Camello family of HATs is conserved across chordates but originated
in cnidaria. To study the origin, conservation and evolution of the
Camello family of proteins, we performed a sequence homology
search for CML03 along its length in 23 representative eukaryotic
genomes (Additional file 6: Table S1). Out of the 450 homologous
proteins detected by jackhmmer in the 23 genomes, only 46 proteins
aligned along the length of the CML03 protein (Additional file 7: Figure
S6). Representative species were selected to construct a multiple
sequence alignment (MSA) for CML03 proteins. Figure 4A shows
the overall conservation of several positions and the location of
predicted transmembrane helices in the MSA. The green boxes depict
Figure 3 | CML03 has perinuclear localization. (A) CML03 was expressed as a GFP fusion protein in HeLa cells. Co-localization of the CML03-GFP fusion
protein was compared with the endoplasmic reticulum marker calnexin. CML03 does not co-localize with the endoplasmic reticulum marker. (B, C) CML03 has
perinuclear localization and co-localizes with the perinuclear membrane marker Lamin B1. The line in the merged image denotes a linear fluorescence
profile of co-localization. The CML03-GFP fusion protein was also present in the cytoplasm of cells strongly overexpressing it.
(D) Western blot analysis of CML03-GFP-overexpressing HeLa cells after cytoplasmic and nuclear fractionation. Lamin B1 and tubulin were used as
markers of the nuclear and cytoplasmic fractions, respectively. CML03-GFP was mainly observed in the nuclear fraction. C, cytoplasmic fraction; N, nuclear
fraction. (The full-length blot is provided as Additional file 9: Figure S8.) (E) Overexpression of CML03 and E0CYR6 leads to higher acetylation of
endogenous histone H4. Tubulin was used as a loading control. The bar graph on the right shows the quantification of relative protein intensity in the
gel image after normalization to tubulin. Overexpression of CML03 and E0CYR6 leads to 1.5- and 2.5-fold enrichment of endogenous histone H4
acetylation, respectively.



location of the transmembrane helices predicted in the CML03 protein
sequence (Figure 4A). The transmembrane helices show less conservation
except at a few positions. The location of the acetyltransferase
domain is shown by the pink box. The acetyl-CoA-binding motif
is marked with a solid black line, suggesting that these protein
sequences may be catalytically active. CML03 homologs are present in all
chordates analyzed in this study, including the earliest chordate,
Branchiostoma floridae. The only non-chordate genome that
encodes a full-length CML03 homolog is that of the cnidarian Nematostella.
These results suggest an early
origin of CML03 proteins in metazoan phylogeny. Furthermore, the
CML03 protein is absent in Drosophila melanogaster, Caenorhabditis
elegans and Strongylocentrotus purpuratus. However, comparative
genomic analysis suggests that cnidarians are more closely related to
Figure 4 | Multiple sequence alignment and neighbor-joining tree of 25 CML03 homologs. (A) Proteins that showed significant similarity with
CML03 across its entire length were selected for MSA construction. Green and pink boxes depict the locations of the predicted transmembrane helices and the
acetyltransferase domain in the CML03 protein sequence, respectively. Color shading of the MSA is based on ClustalX. Positions are colored if more
than 30% of their amino acids are conserved. (B) The phylogenetic tree was constructed using the neighbor-joining method for 221 positions. Statistical significance was
assessed using 100 bootstraps. The two groups of CML03-like proteins clearly demarcated by the phylogenetic tree are shown in brown and navy blue.
The tree is rooted with the ATS042_NEMVE sequence. The suffix of each sequence identifier indicates the species as follows: HUMAN: Homo sapiens,
MOUSE: Mus musculus, CHICK: Gallus gallus, XENTR: Xenopus tropicalis, DANRE: Danio rerio, BRAFL: Branchiostoma floridae, and NEMVE:
Nematostella vectensis.






[Figure 5: images and chart not reproduced. Panel B is a stacked bar chart with a percentage axis from 0% to 100%; bars: Control MO (n=220), CML03 MO1 (n=180), CML03 MO2 (n=160); legend: Deformed, Dead, Normal.]



Figure 5 | Morpholino-mediated knockdown of CML03 in zebrafish embryos. (A) Lateral views of 18 hpf embryos injected with control MO, CML03
MO1 and CML03 MO2. (B) Stacked bar graph representing the percentage of embryos that were normal, dead or deformed after injection of control
and CML03 morpholinos. Control (n = 220), CML03 MO1 (n = 180) and CML03 MO2 (n = 160) injected embryos were analyzed in this experiment,
where n represents the number of embryos injected.



the chordates than are Drosophila melanogaster and Caenorhabditis
elegans. Previous studies also suggest heavy gene loss in
these two organisms. This in turn would suggest that CML03
might have emerged in Nematostella but been subsequently lost in other
invertebrates.

The phylogenetic tree of these 25 proteins clearly demarcates two
groups of CML03 proteins, which are likely to have diverged in
gnathostomes from a single ancestral gene (Figure 4B). Bootstrap
support for this split is 92. This suggests a probable duplication of
the CML03 protein in the last common ancestor of Danio rerio and
the other extant gnathostomes. The split between these two groups is also evident
using the maximum-likelihood method, with bootstrap support of 89.
However, statistical support for other clades varies. Thus, CML03-like
proteins seem to have first appeared in cnidarians, with subsequent
loss in a few invertebrates and retention in all
chordates.

Morpholino-mediated knockdown of CML03 in zebrafish results
in defective axis elongation and head formation. Our study
suggests that the Camello family of HATs is evolutionarily conserved in
all chordates and appeared for the first time in cnidarians. The
acquisition of Camello proteins coincides with the emergence of
cnidarians, suggesting that Camello proteins might have evolved
to perform functions related to pattern formation during
gastrulation. We therefore used zebrafish as a model system to
study the in vivo functional relevance of the CML03 protein in early
embryonic development. To downregulate CML03, we used two antisense
morpholinos that target its mRNA sequence
spanning the first methionine codon. Embryos were injected with
4 ng of control morpholino or of one of the two separate CML03 morpholinos;
knockdown with the CML03 morpholinos led to
defects in axis formation (Figure 5) of the kind typically seen in
embryos with abnormalities in convergent extension. CML03
morpholino-injected embryos showed a shortened body axis, a
deformed notochord, and a short trunk and tail at 18 hours
post-fertilization (hpf). These observations suggest a potential role for the
CML03 protein in the convergent extension process in zebrafish.
Further studies would help to elucidate the mechanisms by
which CML03 is involved in this process.

Discussion 

In this study, we classified HATs into various families
on the basis of protein sequences and domain organizations.
We could identify all the known families of HATs based on the
presence of the HAT domain in the mouse and zebrafish genomes. Our
genome-wide bioinformatics analysis also identified a group of
HAT proteins, namely Camello proteins, and we asked whether Camello
proteins represent a distinct, novel family of HATs. Several properties,
including the absence of the characteristic features of the GNAT and
p300/CBP families (bromo-domain) and of the MYST family
(chromo-domain), the relatively small size of Camello proteins, the absence
of any other associated domain and their association with the nuclear
membrane, all indicate that the Camello family might represent a novel
family of HATs.
To our knowledge, Camello proteins are the only HATs exhibiting
perinuclear localization. Our analysis also indicates that CML03 has
the homeobox signature motif, suggesting DNA-binding activity.
The nuclear pore complex (NPC) in the perinuclear membrane is
considered a key regulator of molecular trafficking between the
cytoplasm and the nucleus. However, several lines of experimental evidence
suggest that the NPC also acts as a bridge between nuclear transport
and gene regulation through a phenomenon known as 'gene gating'. The
cross-talk between chromatin remodeling, transcription and
messenger RNA export at the NPC can act as a checkpoint for precise gene
expression. The transcription and export factor Sus1 is one such
example: it is a stable component of both SAGA, the histone
acetyltransferase complex, and TREX-2, the nuclear pore-associated
transcription and export complex. Sus1 functions in conjunction
with the nucleoporins Nup1 and Nup60 to recruit the mRNA export
machinery to the NPC, which is also the site for physical tethering
of actively transcribed genes. Based on the above observations, we
speculate that Camello proteins might mediate a dynamic
attachment between transcriptionally active chromatin and the nuclear
matrix and assist in gene regulation. Further systematic
investigation of the function of Camello proteins should reveal novel
functions in the regulatory repertoire of the multi-faceted HAT
complexes.

The Camello family of HATs is evolutionarily conserved in all chordates
and appeared for the first time in cnidarians. The acquisition of
Camello proteins coincides with the emergence of cnidarians, the
early-branching multicellular organisms with two germ layers, a body
axis and pattern formation. To date, CML03 has been studied only in
Xenopus laevis, where overexpression of CML03 leads to inhibition of
gastrulation, suggesting that Camello proteins might have evolved
to perform functions related to pattern formation during
gastrulation. Interestingly, the CML03 protein is absent in Drosophila
melanogaster, Caenorhabditis elegans and Strongylocentrotus purpuratus,
presumably due to heavy gene loss in these organisms. Thus,
CML03 might have emerged in Nematostella, been subsequently lost
in other invertebrates, and have functions associated with pattern
formation.

The correlation between histone acetylation and gene
expression is crucial in determining the fate of the embryo during embryonic
development. Knockout of the lysine acetyltransferases p300, GCN5
and CBP has been shown to cause embryonic lethality in mice. p300-null
mice exhibit defects in neurulation and heart development. CBP is
required for neural tube closure and hematopoietic differentiation,
and GCN5-null mice exhibit defects in neural tube closure and
development of mesodermal lineages. This in turn suggests that
individual HATs are required for specific gene expression
programs during embryonic development. Similarly, knockdown of
the Camello protein CML03 in zebrafish is embryonic lethal and
results in defects in axis elongation and head formation, suggesting
a critical role in pattern formation during early embryogenesis. This
effect might be primarily due to abnormalities in the convergent
extension process in zebrafish embryos. The essential role of CML03
in zebrafish development could be attributed to the fact that many
HATs have been shown to exert a high degree of functional specificity
during early embryonic development, where each HAT plays
an independent and distinct role in a stage-specific manner.

Conclusion 

In this study we have identified a novel family of HATs and
demonstrated that it is specific for histone H4 acetylation. Our study is the
first demonstration of any HAT in the perinuclear membrane.
Camello family proteins are evolutionarily conserved in all
chordates and originated in cnidarians. We also found that Camello
proteins are essential for zebrafish development. Knockdown of
CML03 in zebrafish leads to defects in axis elongation and head
formation, suggesting a critical role in pattern formation during
early embryogenesis. The focus of our future research will be to dissect
the molecular mechanism associated with CML03-mediated
acetylation and its effect on transcription, mRNA export and nuclear
organization.

Methods 

Protein sequence retrieval and bioinformatics analyses. The complete sets of protein
sequences from the ORFs of Mus musculus and Danio rerio were obtained from the
UniProt database. The two genomes were then surveyed for putative histone
acetyltransferases (HATs) against known HAT domains present in the Pfam database
using sensitive sequence homology search algorithms such as BLAST and
PSI-BLAST with a stringent e-value cutoff of 0.0001. The hits identified were further pruned
using the CD-HIT program to eliminate redundant sequences that are 100%
identical. Truncated sequences were also removed from the analysis. Hits that
lacked significant sequence similarity with the query were examined
manually and submitted to a fold prediction method such as Phyre to assess the
compatibility of the sequence with the 3D structural fold of HATs. The final data set of 33
putative mouse HATs was further analyzed for associated domains, which were
identified using HMMER against the Pfam database.
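The first two pruning steps described above (an e-value cutoff followed by removal of 100% identical sequences) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes hits are available in BLAST tabular format (`-outfmt 6`, where column 2 is the subject ID and column 11 the e-value), treats the cutoff as inclusive, and replaces CD-HIT with a plain exact-string comparison. The toy hit lines and sequences are invented for the example.

```python
from typing import Dict, Iterable, Set

EVALUE_CUTOFF = 1e-4  # the "stringent e-value" quoted in the Methods


def significant_subjects(blast_lines: Iterable[str],
                         cutoff: float = EVALUE_CUTOFF) -> Set[str]:
    """Subject IDs having at least one hit with e-value <= cutoff.

    Expects tab-separated BLAST tabular lines (12 default columns).
    """
    keep: Set[str] = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if float(fields[10]) <= cutoff:
            keep.add(fields[1])
    return keep


def drop_identical(sequences: Dict[str, str]) -> Dict[str, str]:
    """Keep one representative per group of identical sequences (first ID wins)."""
    seen: Set[str] = set()
    unique: Dict[str, str] = {}
    for seq_id, seq in sequences.items():
        if seq not in seen:
            seen.add(seq)
            unique[seq_id] = seq
    return unique


# Toy data (hypothetical identifiers and sequences).
TOY_HITS = [
    "query1\tprotA\t45.0\t120\t60\t2\t1\t120\t5\t124\t1e-30\t110",
    "query1\tprotB\t28.0\t80\t55\t3\t10\t90\t1\t80\t0.05\t30",
    "query1\tprotC\t39.0\t118\t70\t1\t3\t120\t2\t119\t1e-4\t52",
]
TOY_SEQS = {"protA": "MAPYHIRK", "protA_copy": "MAPYHIRK", "protC": "MSLYRIRE"}

if __name__ == "__main__":
    print(sorted(significant_subjects(TOY_HITS)))
    print(list(drop_identical(TOY_SEQS)))
```

Unlike this exact-match filter, CD-HIT clusters sequences by alignment identity, so the sketch only approximates the redundancy removal step.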

Homologs of mouse CML03 protein and phylogenetic analysis. Homologs of the
mouse CML03 protein were identified by searching against a database of protein
sequences from 23 completely sequenced eukaryotic organisms (Additional file 6:
Table S1). Details of the homology search and phylogenetic analysis are provided in
Supplementary Information.
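The Results describe retaining only those homologs that align along the length of CML03 (46 of 450 detected proteins). One way to express such a filter is by query coverage, as in the sketch below. The 226-residue query length comes from the text; the 90% coverage threshold, the `Hit` type and the toy coordinates are assumptions for illustration, since the actual criterion is given only in the Supplementary Information.

```python
from typing import List, NamedTuple

QUERY_LENGTH = 226  # length of CML03 reported in the Results


class Hit(NamedTuple):
    target: str
    query_start: int  # 1-based, inclusive, on the CML03 query
    query_end: int    # 1-based, inclusive


def query_coverage(hit: Hit, query_length: int = QUERY_LENGTH) -> float:
    """Fraction of the query spanned by the aligned region."""
    return (hit.query_end - hit.query_start + 1) / query_length


def full_length_hits(hits: List[Hit], min_coverage: float = 0.9) -> List[Hit]:
    """Keep hits whose alignment spans at least min_coverage of the query.

    The 0.9 default is a hypothetical threshold, not a value from the paper.
    """
    return [hit for hit in hits if query_coverage(hit) >= min_coverage]


# Toy hits (hypothetical): one near full-length, one covering only part of the HAT domain.
TOY_HITS = [Hit("toy_full_length", 3, 224), Hit("toy_domain_only", 80, 200)]

if __name__ == "__main__":
    for hit in TOY_HITS:
        print(hit.target, round(query_coverage(hit), 3))
    print([hit.target for hit in full_length_hits(TOY_HITS)])
```

A coverage filter of this kind separates full-length homologs from proteins that merely share an acetyltransferase domain with the query.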

Cloning of CMLO genes and protein expression. Full-length CMLO genes were
PCR amplified from a mouse embryo cDNA library (E8.5) using gene-specific primers.
Amplified PCR products were cloned into pET28b (Novagen) at the NdeI and XhoI
restriction sites. The clones thus obtained were confirmed by DNA sequencing.
CML03 proteins were expressed in E. coli (BL21) as 6X histidine-tagged fusion
proteins by induction with 0.4 mM isopropyl-β-D-thiogalactopyranoside (IPTG)
for 6 h at 25°C. Proteins were affinity purified using Ni-NTA beads (Qiagen).
CML03-GFP and E0CYR6-GFP fusion constructs were prepared by cloning CML03
and E0CYR6 into the pEGFP-N1 plasmid (Clontech).

C- and N-terminal His-Myc tagging of CML03. C-terminal tagging of CML03 was
performed by digesting the pET28b-CML03 plasmid with the XbaI and XhoI restriction
enzymes. The resultant insert was ligated into NheI- and XhoI-digested pcDNA3.1(-)/myc-His A
plasmid (Addgene). N-terminal Myc-His-tagged CML03 was produced by
cloning His-CML03 between the SalI and AflII restriction sites of pcDNA3.1(-)/myc-His A.

Histone acetyltransferase assays. Camello protein (100 ng) was incubated for
30 min at 30°C in HAT buffer (20 mM Tris-HCl, pH 8.8, 1.5 mM MgCl2, 10 mM
NaCl and acetyl-CoA) with 400 ng of recombinant histones H3 and H4. Reactions were
run on 15% SDS-PAGE and Western blotting was performed using Pan-lysine acetyl
(06-933, Upstate), H3ac (Ab47915) and H4ac (06-866, Millipore) antibodies. Mass
spectrometric analysis of the proteins after HAT assays was performed at the W.M.
Keck Biomedical Mass Spectrometry Laboratory, University of Virginia Health
System, USA.

Cell culture, transfection, immunostaining and microscopy. HeLa cells were 
grown in DMEM medium supplemented with 10% fetal bovine serum at 37°C with 
5% CO2. Details of transfection, immunostaining and microscopy are provided in 
Supplementary Information. 

Morpholino (MO) injections in zebrafish. The Tübingen (Tu) strain of zebrafish was
used for morpholino injections. Control and CML03 morpholinos were
designed by and obtained from Gene Tools, LLC. Four ng each of a 5-base mismatch control MO
(5'-TCCCCTCCATGCACATAACACGAGA-3'), CML03 MO1
(5'-GACGAATCTGCACCTCATCCATGAC-3') and CML03 MO2
(5'-TCGCCTCGATCCAGATAAGACGAGA-3') was injected at the one-cell stage into WT zebrafish embryos,
and phenotypes were scored at 18 hpf. For zebrafish maintenance and
experimentation, the guidelines recommended by the Committee for the Purpose of
Control and Supervision of Experiments on Animals (CPCSEA), Government of
India, were followed. Zebrafish experiments were also approved by the Tata Institute
of Fundamental Research (TIFR), India.
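Two properties of the morpholino sequences listed above can be checked directly: the mRNA site targeted by an antisense oligo is its reverse complement, and a mismatch control differs from its parent oligo at a fixed number of positions. The sketch below uses the sequences exactly as given in this section; the helper names are illustrative, and the sequences are written in DNA letters (T rather than U).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Morpholino sequences as listed in the Methods (5' to 3').
CONTROL_MO = "TCCCCTCCATGCACATAACACGAGA"
CML03_MO1 = "GACGAATCTGCACCTCATCCATGAC"
CML03_MO2 = "TCGCCTCGATCCAGATAAGACGAGA"

if __name__ == "__main__":
    target = reverse_complement(CML03_MO1)
    print("MO1 target site:", target)
    print("contains ATG:", "ATG" in target)
    print("control vs MO2 mismatches:", mismatches(CONTROL_MO, CML03_MO2))
```

For the sequences as printed here, the site complementary to CML03 MO1 contains an ATG, and the control oligo differs from CML03 MO2 at five positions, in line with its description as a 5-base mismatch control.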

Ethics statements. Zebrafish experiments were performed in accordance with the 
guidelines of Committee for the Purpose of Control and Supervision of Experiments 
on Animals (CPCSEA), Government of India. All animal procedures carried out in 
this study were reviewed, approved, and supervised by the Institutional Ethics 
Committee of the Tata Institute of Fundamental Research (TIFR), India.